Derivation of studio camera position and motion from the camera image

ABSTRACT

Studio camera position and motion may be derived from the camera image by separating out the background and, from a background having a number of areas of hue and/or brightness different from adjacent areas, deriving estimates of movement from one image to the next. The initial image is used as a reference and amended with predicted motion values. The amended image is compared with incoming images and the result used to derive translation and scale change information. Once the proportion of the reference image contained in an incoming image falls below a threshold, a fresh reference image is adopted.

FIELD OF THE INVENTION

This invention relates to the derivation of information regarding the position of a television camera from image data acquired by the camera.

BACKGROUND TO THE INVENTION

In television production, it is often required to video live action in the studio and electronically superimpose the action on a background image. This is usually done by shooting the action in front of a blue background and generating a ‘key’ from the video signal to distinguish between foreground and background. In the background areas, the chosen background image can be electronically inserted.

One limitation to this technique is that the camera in the studio cannot move, since this would generate motion of the foreground without commensurate background movement. One way of allowing the camera to move is to use a robotic camera mounting that allows a predefined camera motion to be executed, the same camera motion being used when the background images are shot. However, the need for predefined motion places severe artistic limitations on the production process.

Techniques are currently under development that aim to be able to generate electronically background images that can be changed as the camera is moved so that they are appropriate to the present camera position. Thus a means of measuring the position of the camera in the studio is required. One way in which this can be done is to attach sensors to the camera to determine its position and angle of view; however, the use of such sensors is not always practical.

The problem being addressed here is how to derive the position and motion of the camera using only the video signal from the camera. Thus the technique can be used on an unmodified camera without special sensors.

DESCRIPTION OF PRIOR ART

The derivation of the position and motion of a camera by analysis of its image signal is a task often referred to as passive navigation; there are many examples of approaches to this problem in the literature, the more pertinent of which are as follows:

1. Brandt et al. 1990. Recursive motion estimation based on a model of the camera dynamics.

2. Brandt, A., Karmann, K., Lanser, S. Signal Processing V: Theories and Applications (Ed. Torres, L. et al.), Elsevier, pp. 959-962, 1990.

3. Buxton et al. 1985. Machine perception of visual motion. Buxton, B. F., Buxton, H., Murray, D. W., Williams, N. S. GEC Journal of Research, Vol. 3, No. 3, pp. 145-161.

4. Netravali and Robbins 1979. Motion-compensated television coding: Part 1. Netravali, A. N., Robbins, J. D. Bell System Technical Journal, Vol. 58, No. 3, Mar. 1979, pp. 631-670.

5. Thomas 1987. Television motion measurement for DATV and other applications. Thomas, G. A. BBC Research Department Report No. 1987/11.

6. Uomori et al. 1992. Electronic image stabilisation system for video cameras and VCRs. Uomori, K., Morimura, A., Ishii, J. SMPTE Journal, Vol. 101, No. 2, pp. 66-75, Feb. 1992.

7. Wu and Kittler 1990. A differential method for simultaneous estimation of rotation, change of scale and translation. Wu, S. F., Kittler, J. Signal Processing: Image Communication 2, Elsevier, 1990, pp. 69-80.

For example, if a number of feature points can be identified in the image and their motion tracked from frame to frame, it is possible to calculate the motion of the camera relative to these points by solving a number of non-linear simultaneous equations [Buxton et al. 1985]. The tracking of feature points is often achieved by measuring the optical flow (motion) field of the image. This can be done in a number of ways, for example by using an algorithm based on measurements of the spatio-temporal luminance gradient of the image [Netravali and Robbins 1979].

A similar method is to use Kalman filtering techniques to estimate the camera motion parameters from the optical flow field and depth information [Brandt et al. 1990].

However, in order to obtain reliable (relatively noise-free) information relating to the motion of the camera, it is necessary to have a good number of feature points visible at all times, and for these to be distributed in space in an appropriate manner. For example, if all points are at a relatively large distance from the camera, the effect of a camera pan (rotation of the camera about the vertical axis) will appear very similar to that of a horizontal translation at right angles to the direction of view. Points at a range of depths are thus required to distinguish reliably between these types of motion.

Simpler algorithms exist that allow a sub-set of camera motion parameters to be determined, while placing fewer constraints on the scene content. For example, horizontal and vertical image motions such as those caused by camera panning and tilting can be measured relatively simply for applications such as the steadying of images in hand-held cameras [Uomori et al. 1992].

SUMMARY OF THE INVENTION

In order to derive all required camera parameters (three spatial coordinates, pan and tilt angles and degree of zoom) from analysis of the camera images, a large number of points in the image would have to be identified and tracked. Consideration of the operational constraints in a TV studio suggested that providing an appropriate number of well-distributed reference points in the image would be impractical: markers would have to be placed throughout the scene at a range of different depths in such a way that a significant number were always visible, regardless of the position of the camera or actors.

We have appreciated that measurements of image translation and scale change are relatively easy to make; from these measurements it is easy to calculate either

1. pan, tilt and zoom under the assumption that the camera is mounted on a fixed tripod: the scale change is a direct indication of the amount by which the degree of camera zoom has changed, and the horizontal and vertical translation indicate the change in pan and tilt angles; or

2. horizontal and vertical movement under the assumption that the camera is mounted in such a way that it can move in three dimensions (but cannot pan or tilt) and is looking in a direction normal to a planar background: the scale change indicates the distance the camera has moved along the optical axis and the image translation indicates how far the camera has moved normal to this axis.

This approach does not require special markers or feature points in the image, merely sufficient detail to allow simple estimation of global motion parameters. Thus it should be able to work with a wide range of picture material. All that is required is measurement of the initial focal length (or angle subtended by the field of view) and the initial position and angle of view of the camera.
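
By way of illustration only, the following Python sketch shows how measured image translation and scale change might be converted into changes of pan, tilt and zoom under assumption 1 above. The function and variable names, the pixel-based focal length calculation and the example figures are assumptions made for the sketch, not part of the method as such.

import math

def camera_update_fixed_tripod(dx_pixels, dy_pixels, scale_change,
                               image_width, h_fov_degrees):
    """Convert measured image translation and scale change into changes of
    pan, tilt and zoom, assuming the camera is on a fixed tripod (case 1).

    dx_pixels, dy_pixels : accumulated image translation in pixels
    scale_change         : accumulated scale change (1.0 = no change of zoom)
    image_width          : image width in pixels
    h_fov_degrees        : initial horizontal angle of view of the camera
    """
    # Focal length expressed in pixels, derived from the initial angle of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(h_fov_degrees) / 2.0)

    pan_degrees = math.degrees(math.atan(dx_pixels / focal_px))
    tilt_degrees = math.degrees(math.atan(dy_pixels / focal_px))
    zoom_factor = scale_change  # the scale change maps directly to the change of zoom

    return pan_degrees, tilt_degrees, zoom_factor

# Example: a 20-pixel horizontal shift on a 720-pixel-wide image with a 40 degree field of view.
print(camera_update_fixed_tripod(20.0, 0.0, 1.05, 720, 40.0))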

The invention is defined by the independent claims to which reference should be made. Preferred features are set out in the dependent claims.

The approach described may be extended to more general situations (giving more freedom on the type of camera motion allowed) if other information such as image depth could be derived [Brandt et al. 1990]. Additional information from sensors on the camera (for example to measure the degree of zoom) may allow more flexibility.

In order to allow the translation and scale change of the image to be measured, there must be sufficient detail present in the background of the image. Current practice is usually based upon the use of a blue screen background, to allow a key signal to be generated by analysing the RGB values of the video signal. Clearly, a plain blue screen cannot be used if camera motion information is to be derived from the image, since it contains no detail. Thus it will be necessary to use a background that contains markings of some sort, but is still of a suitable form to allow a key signal to be generated.

One form of background that is being considered is a ‘checkerboard’ of squares of two similar shades of blue, each closely resembling the blue colour used at present. This should allow present keying techniques to be used, while providing sufficient detail to allow optical flow measurements to be made. Such measurements could be made on a signal derived from an appropriate weighted sum of RGB values designed to accentuate the differences between the shades of blue.
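
As a rough sketch of how such a weighted sum might be formed, the following Python fragment matrixes an RGB image into a single-component signal. The particular weights shown are assumed values chosen purely for illustration; in practice they would be tuned to the two shades of blue actually used on the backcloth.

import numpy as np

def blue_difference_signal(rgb_image, weights=(-0.3, -0.4, 1.0)):
    """Form a single-component signal from an RGB image using a weighted sum
    chosen to accentuate the difference between two similar shades of blue.

    rgb_image : float array of shape (height, width, 3)
    weights   : illustrative R, G, B weights (assumed values)
    """
    w = np.asarray(weights, dtype=np.float64)
    return rgb_image @ w  # per-pixel weighted sum of R, G and B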

The key signal may be used to remove foreground objects from the image prior to the motion estimation process. Thus the motion of foreground objects will not confuse the calculation.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows in block schematic form the basic arrangement of a camera motion estimator system embodying the invention;

FIG. 2 illustrates the relationship of measurement points in current and reference images;

FIG. 3 is a schematic view showing the displacement of a given measurement point from a reference image to the current image; and

FIG. 4 shows a checkerboard background.

DESCRIPTION OF BEST MODE

The algorithm chosen for measuring global translation and scale change must satisfy the following criteria:

1. The chosen algorithm cannot be too computationally intensive, since it must run in real-time;

2. It must be capable of highly accurate measurements, since measurement errors will manifest themselves as displacement errors between foreground and background;

3. Measurement errors should not accumulate to a significant extent as the camera moves further away from its starting point.

Embodiment 1: Motion Estimation Followed by Global Motion Parameter Determination

An example of one type of algorithm that could be used is one based on a recursive spatio-temporal gradient technique described in reference 4 [Netravali and Robbins 1979]. This kind of algorithm is known to be computationally efficient and to be able to measure small displacements to a high accuracy. Other algorithms based on block matching described in reference 6 [Uomori et al. 1992] or phase correlation described in reference 5 [Thomas 1987] may also be suitable.

The algorithm may be used to estimate the motion on a sample-by-sample basis between each new camera image and a stored reference image. The reference image is initially that viewed by the camera at the start of the shooting, when the camera is in a known position. Before each measurement, the expected translation and scale change is predicted from previous measurements and the reference image is subject to a translation and scale change by this estimated amount. Thus the motion estimation process need only measure the difference between the actual and predicted motion.
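
A minimal sketch of this step of transforming the reference image by the predicted motion is given below, assuming a single-component image held as a NumPy array. Nearest-neighbour sampling about the picture centre and the function name are illustrative choices only; in practice sub-pixel interpolation would normally be used.

import numpy as np

def transform_reference(reference, predicted_dx, predicted_dy, predicted_scale):
    """Apply the predicted translation and scale change to the stored reference
    image so that the motion estimator only has to measure the residual motion.
    """
    h, w = reference.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)

    # For each output pixel, find the reference pixel it is predicted to come from:
    # current = (reference - centre) * scale + centre + displacement.
    src_x = (xs - cx - predicted_dx) / predicted_scale + cx
    src_y = (ys - cy - predicted_dy) / predicted_scale + cy

    src_x = np.clip(np.rint(src_x), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(src_y), 0, h - 1).astype(int)
    return reference[src_y, src_x]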

The motion vector field produced is analysed to determine the horizontal and vertical displacement and scale change. This can be done by selecting a number of points in the vector field likely to have accurate vectors (for example in regions having both high image detail and uniform vectors). The scale change can be determined by examining the difference between selected vectors as a function of the spatial separation of the points. The translation can then be determined from the average values of the measured vectors after discounting the effect of the scale change. The measured values are added to the estimated values to yield the accumulated displacement and scale change for the present camera image.
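
The following sketch illustrates one way such a vector-field analysis might be carried out. Rather than reproducing the vector-difference procedure described above, it fits a global translation and scale change to the selected vectors by least squares, which amounts to the same underlying model; all names are illustrative.

import numpy as np

def global_motion_from_vectors(points, vectors, image_centre):
    """Fit a global translation and scale change to motion vectors measured at
    selected points, under the model
        vector = translation + (scale - 1) * (point - centre).
    Returns (tx, ty, scale).
    """
    p = np.asarray(points, dtype=float) - np.asarray(image_centre, dtype=float)
    v = np.asarray(vectors, dtype=float)
    n = len(p)

    # Unknowns: tx, ty and (scale - 1); two equations per measurement point.
    a = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    a[0::2, 0] = 1.0        # x-equations depend on tx
    a[0::2, 2] = p[:, 0]    # ...and on (scale - 1) times x
    a[1::2, 1] = 1.0        # y-equations depend on ty
    a[1::2, 2] = p[:, 1]    # ...and on (scale - 1) times y
    b[0::2] = v[:, 0]
    b[1::2] = v[:, 1]

    (tx, ty, s_minus_1), *_ = np.linalg.lstsq(a, b, rcond=None)
    return tx, ty, 1.0 + s_minus_1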

More sophisticated methods of analysing the vector field could be added in future, for example in conjunction with means for determining the depth of given image points, to extend the flexibility of the system.

As the accumulated translation and scale change get larger, the translated reference image will begin to provide a poor approximation to the current camera image. For example, if the camera is panning to the right, picture material on the right of the current image will not be present in the reference image and so no motion estimate can be obtained for this area. To alleviate this problem, once the accumulated values exceed a given threshold the reference image is replaced by the present camera image. Each time this happens, however, measurement errors will accumulate.

All the processing will be carried out on images that have been spatially filtered and subsampled. This will reduce the amount of computation required, with no significant loss in measurement accuracy. The filtering process also softens the image; this is known to improve the accuracy and reliability of gradient-type motion estimators. Further computational savings can be achieved by carrying out the processing between alternate fields rather than for every field; this will reduce the accuracy with which rapid acceleration can be tracked but this is unlikely to be a problem since most movements of studio cameras tend to be smooth.
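
A simple sketch of the pre-filtering and subsampling stage is shown below; the block-average filter and the 2:1 factor are assumptions made purely for illustration.

import numpy as np

def prefilter_and_subsample(image, factor=2):
    """Spatially low-pass filter an image and subsample it, reducing the amount
    of computation needed by the motion estimator. A block average is used as
    the low-pass filter here for simplicity.
    """
    h, w = image.shape
    h -= h % factor
    w -= w % factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))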

Software to implement the most computationally intensive parts of the processing has been written and benchmarked, to provide information to aid the specification and design of the hardware accelerator. The benchmarks showed that the process of filtering and down-sampling the incoming images is likely to use over half of the total computation time.

Embodiment 2: Direct Estimation of Global Motion Parameters

An alternative and preferred method of determining global translation and scale change is to derive them directly from the video signal. A method of doing this is described in reference 7 [Wu and Kittler 1990]. We have extended this method to work using a stored reference image and to use the predicted motion values as a starting point. Furthermore, the technique is applied only at a sub-set of pixels in the image, which we have termed measurement points, in order to reduce the computational load. As in the previous embodiment, the RGB video signal is matrixed to form a single-component signal and spatially low-pass filtered prior to processing. As described previously, only areas identified by a key signal as background are considered.

The method is applied by considering a number of measurement points in each incoming image and the corresponding points in the reference image, displaced according to the predicted translation and scale change. These predicted values may be calculated, for example, by linear extrapolation of the measurements made in the preceding two images. The measurement points may be arranged as a regular array, as shown in FIG. 2. A more sophisticated approach would be to concentrate measurement points in areas of high luminance gradient, to improve the accuracy when a limited number of points are used. We have found that 500-1000 measurement points distributed uniformly yields good results. Points falling in the foreground areas (as indicated by the key signal) are discarded, since it is the motion of the background that is to be determined.
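
The following sketch shows, with assumed names and an assumed point count, how a roughly regular array of measurement points might be laid out and how the predicted translation and scale change might be extrapolated from the two preceding measurements; the geometric extrapolation of the scale change is one possible choice, since scale changes compose multiplicatively.

import numpy as np

def measurement_points(width, height, n_points=750):
    """Lay out an approximately regular grid of measurement points (cf. FIG. 2).
    Around 500-1000 points distributed uniformly has been found to work well."""
    aspect = width / height
    ny = max(1, int(round(np.sqrt(n_points / aspect))))
    nx = max(1, int(round(n_points / ny)))
    xs = np.linspace(width * 0.05, width * 0.95, nx)
    ys = np.linspace(height * 0.05, height * 0.95, ny)
    return [(x, y) for y in ys for x in xs]

def predict_motion(prev2, prev1):
    """Linearly extrapolate the translation and scale change for the next image
    from the two preceding measurements, each given as (dx, dy, scale)."""
    dx = 2 * prev1[0] - prev2[0]
    dy = 2 * prev1[1] - prev2[1]
    scale = prev1[2] ** 2 / prev2[2]  # geometric extrapolation of the scale change
    return dx, dy, scale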

At each measurement point, luminance gradients are calculated as shown in FIG. 3. These may be calculated, for example, by simply taking the difference between pixels either side of the measurement point. Spatial gradients are also calculated for the corresponding point in the reference image, offset by the predicted motion. Sub-pixel interpolation may be employed when calculating these values. The temporal luminance gradient is also calculated; again sub-pixel interpolation may be used in the reference image. An equation is formed relating the measured gradients to the motion values as follows:

Gradients are (approximately) related to displacement and scale changes by the equation

g_x X + g_y Y + (Z − 1)(g_x x + g_y y) = g_t

where

g_x = (gr_x + gc_x)/2

g_y = (gr_y + gc_y)/2

are the horizontal and vertical luminance gradients averaged between the two images;

gc_x, gc_y are the horizontal and vertical luminance gradients in the current image;

gr_x, gr_y are the horizontal and vertical luminance gradients in the reference image;

g_t is the temporal luminance gradient; and

x, y are the coordinates of the measurement point, measured relative to the centre of scaling (typically the picture centre).

X and Y are the displacements between current and reference image and Z is the scale change (over and above those predicted).

An equation is formed for each measurement point and a least-squares solution is calculated to obtain values for X, Y, and Z.
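
A minimal sketch of this least-squares step is given below. One equation is formed per usable (background) measurement point and the set is solved for X, Y and Z with a standard least-squares routine; the argument layout and names are assumptions made for the sketch.

import numpy as np

def solve_residual_motion(gc_x, gc_y, gr_x, gr_y, g_t, xs, ys):
    """Solve for the residual displacement (X, Y) and scale change Z between the
    current image and the motion-compensated reference image, using
        g_x X + g_y Y + (Z - 1)(g_x x + g_y y) = g_t
    at each measurement point. All arguments are 1-D arrays, one entry per
    usable measurement point; xs, ys are point coordinates relative to the
    centre of scaling.
    """
    g_x = (np.asarray(gr_x, dtype=float) + np.asarray(gc_x, dtype=float)) / 2.0
    g_y = (np.asarray(gr_y, dtype=float) + np.asarray(gc_y, dtype=float)) / 2.0
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)

    # Columns correspond to the unknowns X, Y and (Z - 1).
    a = np.column_stack([g_x, g_y, g_x * xs + g_y * ys])
    b = np.asarray(g_t, dtype=float)

    (big_x, big_y, z_minus_1), *_ = np.linalg.lstsq(a, b, rcond=None)
    return big_x, big_y, 1.0 + z_minus_1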

Derivation of the equation may be found in reference 7 [Wu and Kittler 1990] (this reference includes the effect of image rotation; we have omitted rotation since it is of little relevance here, as studio cameras tend to be mounted such that they cannot rotate about the optic axis).

The set of simultaneous linear equations derived in this way (one for each measurement point) is solved using a standard least-squares solution method to yield estimates of the difference between the predicted and the actual translation and scale change. The calculated translation values are then added to the predicted values to yield the estimated translation between the reference image and the current image.

Similarly, the calculated and predicted scale changes are multiplied together to yield the estimated scale change. The estimated values thus calculated are then used to derive a prediction for the translation and scale change of the following image.

As described earlier, the reference image is updated when the camera has moved sufficiently far away from its initial position. This automatic refreshing process may be triggered, for example, when the area of overlap between the incoming and reference image goes below a given threshold. When assessing the area of overlap, the key signal needs to be taken into account, since for example an actor who obscured the left half of the background in the reference image might move so that he obscures the right half, leaving no visible background in common between incoming and reference images. One way of measuring the degree of overlap is to count the number of measurement points that are usable (i.e. that fall in visible background areas of both the incoming and reference image). This number may be divided by the number of measurement points that were usable when the reference image was first used, to obtain a measure of the usable image area as a fraction of the maximum area obtainable with that reference image. If the initial number of usable points in a given reference image was itself below a given threshold, this would indicate that most of the image was taken up with foreground rather than background, and a warning message should be produced.
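
The overlap test described above might be implemented along the following lines; the threshold values shown are assumptions rather than values taken from the description.

def should_refresh_reference(points_now_usable, points_usable_at_adoption,
                             min_fraction=0.5, min_points_for_warning=100):
    """Decide whether the reference image should be replaced, based on how many
    measurement points are still usable (i.e. fall in visible background in both
    the incoming and the reference image). Returns (refresh, warning) flags.
    """
    # Too few usable points even when the reference was first adopted suggests
    # the image was mostly foreground, so a warning should be produced.
    warning = points_usable_at_adoption < min_points_for_warning
    if points_usable_at_adoption == 0:
        return True, True
    fraction_usable = points_now_usable / points_usable_at_adoption
    return fraction_usable < min_fraction, warning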

It can also be advantageous to refresh the reference image if the measured scale change exceeds a given range (e.g. if the camera zooms in a long way). Although in this situation the number of usable measurement points may be very high, the resolution of the stored reference image could become inadequate to allow accurate motion estimation.

When the reference image is updated, it can be retained in memory for future use, together with details of its accumulated displacement and scale change. When a decision is made that the current reference image is no longer appropriate, the stored images can be examined to see whether any of these gave a suitable view of the scene. This assessment can be carried out using similar criteria to those explained above. For example, if the camera pans to the left and then back to its starting position, the initial reference image may be re-used as the camera approaches this position. This ensures the measurements of camera orientation made at the end of the sequence will be as accurate as those made at the beginning.

Referring back to FIG. 1, apparatus for putting each of the two motion estimation methods into practice is shown. A camera 10 derives a video signal from the background 12 which, as described previously, may be patterned in two tones as shown in FIG. 4. The background cloth shown in FIG. 4 shows a two-tone arrangement of squares. Squares 30 of one tone are arranged adjacent squares 32 of the other tone. Shapes other than squares may be used and it is possible to use more than two different tones. Moreover, the tones may differ in both hue and brightness or in either hue or brightness. At present, it is considered preferable for the brightness to be constant, as variations in brightness might show in the final image.

Although the colour blue is the most common for the backcloth, other colours, for example green or orange, are sometimes used when appropriate. The technique described is not peculiar to any particular background colour but requires a slight variation in hue and/or brightness between a number of different areas of the background, and that the areas adjacent to a given area have a brightness and/or hue different from that of the given area. This contrast enables motion estimation from the background to be performed.

Red, green and blue (RGB) colour signals formed by the camera are matrixed into a single colour signal and applied to a spatial low-pass filter (at 14). The low-pass output is applied to an image store 16 which holds the reference image data and whose output is transformed at 18 by applying the predicted motion for the image. The motion-adjusted reference image data is applied, together with the low-pass filtered image, to a unit 20 which measures the net motion in background areas between an incoming image at input I and a stored reference image at input R. The unit 20 applies one of the motion estimation algorithms described. The net motion measurement is performed under the control of a key signal K derived by a key generator 22 from the unfiltered RGB output from the camera 10, to exclude foreground portions of the image from the measurement. The motion prediction signal is updated on the basis of previously measured motion, thus ensuring that the output from the image store 16 is accurately interpolated. When, as discussed previously, the camera has moved sufficiently far away from its initial position, a refresh signal 24 is sent from the net motion measurement unit 20 to the image store 16. On receipt of the refresh signal 24, a fresh image is stored in the image store and used as the basis for future net motion measurements.

The output from the net motion measurement unit 20 is used to derive an indication of current camera position and orientation, as discussed previously.

Optionally, sensors 26 mounted on the camera can provide data to the net motion measurement unit 20 which augment or replace the image-derived motion signal.

The image store 16 may comprise a multi-frame store enabling storage of previous reference images as well as the current reference image.

The technique described can also be applied to image signals showing arbitrary picture material instead of just the blue background described earlier. If objects are moving in the scene, these can be segmented out by virtue of their motion rather than by using a chroma-key signal. The segmentation could be performed, for example, by discounting any measurement points for which the temporal luminance gradient (after compensating for the predicted background motion) was above a certain threshold. More sophisticated techniques for detecting motion relative to the predicted background motion can also be used.
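
A sketch of this motion-based discarding of measurement points is given below; the threshold is scene-dependent and the names are illustrative.

import numpy as np

def background_point_mask(temporal_gradients, threshold):
    """Keep only measurement points whose temporal luminance gradient, after
    compensation for the predicted background motion, is small enough to be
    consistent with the background; larger values are assumed to indicate
    independently moving foreground objects. Returns a boolean mask."""
    return np.abs(np.asarray(temporal_gradients, dtype=float)) <= threshold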

It will be understood that the techniques described may be implemented either by special purpose digital signal processing equipment, by software in a computer, or by a combination of these methods. It will also be clear that the technique can be applied equally well to any television standard.

I claim:
1. A method of measuring the translation and scale change in a sequence of video images, the method comprising: storing a first image in said sequence to form a stored image; forming a prediction of the translation and scale change from said stored image to a further image in said sequence; comparing said stored image to said further image by transforming at least one of said stored image and said further image based on said prediction of the translation and scale change; deriving measurements of translation and scale change between said stored image and said further image from said comparison and said prediction; and replacing said first image with a new incoming image when the image area common to both said first image and incoming image falls below a given proportion of the whole image area of said incoming image.
2. The method of claim 1, wherein only the background areas of said video images are used in the measurement of translation and scale change.
3. The method of claim 2, wherein a signal is used to separate foreground and background and said signal is derived using chroma-key techniques.
4. The method of claim 2, wherein a signal is used to separate foreground and background portions of the images and said signal is derived using motion detection methods to identify objects moving with a different motion from that predicted for the background.
5. The method of claim 1, wherein: each image is a single-component signal derived from a camera viewing a scene containing a background of near-uniform color; said background is divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow the generation of a key signal by chroma-key techniques; and wherein said single-component signal is formed from a three-component camera signal so as to accentuate differences in hue and/or brightness of individual areas of said background to enable motion estimation.
6. A method of measuring the translation and scale change in a sequence of video images of a camera signal derived by a camera, comprising: storing a first image of said sequence to form a stored image, said image comprising a single-component signal derived by a camera viewing a scene containing a background of near-uniform color, and said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow generation of a key signal by chroma-key techniques; forming said single-component signal from said camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation; forming a prediction of the translation and scale change from said stored image to a further image in said sequence; comparing said stored image to said further image by transforming at least one of said stored image and said further image based on said prediction of the translation and scale change; and deriving measurements of translation and scale change between said stored image and said further image from said comparison and said prediction.
7. The method of claim 6, wherein said background is divided into a plurality of areas, each area having one of two hues and/or brightnesses.
8. The method of claim 6, wherein said areas of the background are square.
9. The method of claim 1 or 6, wherein said translation and scale change are predicted by the computation of a number of simultaneous equations each of which relate said translation and scale change to spatial and temporal gradients at a point in the reference image and which are solved to yield a least-squares solution for the motion parameters.
10. The method of claim 1 or 6, comprising: selecting a number of measurement points in said first image for motion estimation; and replacing said first image with a new incoming image when the number of measurement points which lie in areas of background visible in both said first image and a given incoming image falls below a given proportion of the total number of measurement points.
11. The method of claim 1 or 6, comprising replacing said first image if the scale change between an incoming image and the reference image exceeds a given factor.
12. The method of claim 1 or 6, comprising spatially prefiltering said first image prior to storage and comparison.
13. The method of claim 10, wherein said measurement points lie in a regular array in the image.
14. The method of claim 10 wherein said measurement points are chosen to lie at points of high spatial gradient.
15. The method of claim 1 or 6, comprising storing replaced reference images for later use.
16. Apparatus for measuring the translation and scale change in a sequence of video images, comprising: means for acquiring said sequence of images; storage means for storing a first image in said sequence to form a stored image; means for forming a prediction of the translation and scale change from said stored image to a further image in said sequence; means for comparing said stored image to said further image by transforming at least one of said stored image and said further image based on said prediction of the translation and scale change; means for deriving measurements of translation and scale change between said stored image and said further image from said comparison and said prediction; and means for replacing said first image with a new incoming image when the image area common to both said first image and incoming image falls below a given proportion of the whole image area of said incoming images.
17. The apparatus of claim 16, wherein said means for deriving operates only on the background areas of the images, said means for deriving comprising means for separating foreground and background portions of the images using motion techniques to identify objects moving with a different motion from that predicted for the background.
18. The apparatus of claim 16, wherein said means for deriving operates only on the background areas of the images, and said means for deriving comprising a key generator for generating a chroma-key to separate foreground and background.
19. The apparatus of claim 16, comprising: means for generating a single-component signal from a camera viewing a scene containing a background of near-uniform color, said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow the generation of a key signal by chroma-key techniques, and wherein the single-component signal is formed from a three-component camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation.
20. Apparatus for measuring the translation and scale change in a sequence of video images derived by a camera, comprising: means for storing a first image of said sequence to form a stored image, said image comprising a single-component signal derived by a camera viewing a scene containing a background of near-uniform color, said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow generation of a key signal by chroma-key techniques; means for forming said single-component signal from said camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation; means for forming a prediction of the translation and scale change from said stored image to a further image in said sequence; means for comparing said stored image to said further image by transforming at least one of the stored image and said further image based on said prediction of the translation and scale change; and means for deriving, from said comparison and said prediction, measurements of translation and scale change between said stored image and said further image.
21. The apparatus of claim 19 or 20, wherein said background is divided into a plurality of areas, each area having one of two hues and/or brightnesses.
22. The apparatus of claim 19 or 20, wherein said areas of the background are square.
23. The apparatus of claim 19 or 20, comprising: means for predicting the translation and scale change by the computation of a number of simultaneous equations each of which relate the translation and scale change to spatial and temporal gradients at a point in the first image and which are solved to yield a least-squares solution for the motion parameters.
24. The apparatus of claim 19 or 20, wherein the replacing means comprises: means for selecting a number of measurement points in the first image for motion estimation and for replacing the first image with a new incoming image when the number of measurement points which lie in areas of background visible in both the first image and a given incoming image falls below a given proportion of the total number of measurement points.
25. The apparatus of claim 16 or 20, comprising a spatial filter for filtering the images prior to storage and comparison.
26. The apparatus of claim 16 or 20, comprising a further storage means for storing a replaced reference image for future use.
27. The method of claim 1 wherein said step of comparing said stored image to said further image comprises substeps of: transforming said stored image by said prediction of the translation and scale change to form a transformed first image; and comparing a further image in said sequence with said transformed first image.
28. The method of claim 6 wherein said step of comparing said stored image to said further image comprises substeps of: transforming said stored image by said prediction of the translation and scale change to form a transformed first image; and comparing a further image in said sequence with said transformed first image.
29. The apparatus of claim 16 wherein said means for comparing said stored image to said further image comprises: means for transforming said stored image by said prediction of the translation and scale change to form a transformed first image; and means for comparing a further image in said sequence with said transformed first image.
30. The apparatus of claim 20 wherein said means for comparing said stored image to said further image comprises: means for transforming said stored image by said prediction of the translation and scale change to form a transformed first image; and means for comparing a further image in said sequence with said transformed first image.
31. Apparatus for measuring the translation and scale change in a sequence of video images, comprising: an image storage unit that stores a first image in said sequence to form a stored image; a prediction unit that forms a prediction of the translation and scale change from said stored image to a further image in said sequence; an image transformer that compares said stored image to said further image by transforming at least one of said stored image and said further image based on said prediction of the translation and scale change; a measurement unit that derives measurements of translation and scale change between said stored image and said further image from said comparison and said prediction; and a refresh signal generator that replaces said first image with a new incoming image when the image area common to both said first image and incoming image falls below a given proportion of the whole image area of said incoming images.
32. The apparatus of claim 31, wherein said measurement unit operates only on the background areas of the images, said measurement unit separating foreground and background portions of the images using motion techniques to identify objects moving with a different motion from that predicted for the background.
33. The apparatus of claim 31, wherein said measurement unit operates only on the background areas of the images, and said measurement unit comprises a key generator for generating a chroma-key to separate foreground and background.
34. The apparatus of claim 31, comprising: means for generating a single-component signal from a camera viewing a scene containing a background of near-uniform color, said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow the generation of a key signal by chroma-key techniques, and wherein the single-component signal is formed from a three-component camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation.
35. Apparatus for measuring the translation and scale change in a sequence of video images derived by a camera, comprising: an image storage unit that stores a first image of said sequence to form a stored image, said image comprising a single-component signal derived by a camera viewing a scene containing a background of near-uniform color, said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow generation of a key signal by chroma-key techniques; a signal former that forms said single-component signal from said camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation; a prediction unit that forms a prediction of the translation and scale change from said stored image to a further image in said sequence; an image transformer that compares said stored image to said further image by transforming at least one of the stored image and said further image based on said prediction of the translation and scale change; and a measurement unit that derives, from said comparison and said prediction, measurements of translation and scale change between said stored image and said further image.
36. The apparatus of claim 35, wherein said background is divided into a plurality of areas, each area having one of two hues and/or brightnesses.
37. The apparatus of claim 35, wherein said areas of the background are square.
38. A method of deriving a measure of translation and scale change from a camera output signal, the method comprising: providing a background with a pattern comprising a plurality of areas of a first tone and a plurality of areas of a second tone arranged so that transitions between areas occur along at least two axes, whereby the transitions provide reference points from which measures of both translation and scale change may be determined; receiving said camera output signal defining a camera image from a camera viewing a scene containing at least a portion of said background; detecting said first and second tones in the camera output signal by chroma-keying; and deriving from results of said detecting step a measure of translation and scale change.
39. A method according to claim 38, wherein the first and second tones differ in hue.
40. A method according to claim 38, wherein the first and second tones differ in brightness.
41. A method according to claim 38, wherein the first and second tones are of substantially equal brightness.
42. A method according to claim 38, wherein the pattern comprises a chequered pattern.
43. A method according to claim 38, wherein said chroma-keying is used to remove foreground objects from the image prior to determination of translation and scale change.
44. A method according to claim 38 wherein the camera output signal is spatially filtered prior to storage and comparison.
45. A method according to claim 38, wherein said measures of translation and scale change are determined using a stored reference image.
46. A method according to claim 45, wherein translation and scale change are determined by transforming the reference image and comparing with the camera image.
47. A method according to claim 46, wherein said transforming is based on a prediction of the translation and scale change.
48. A method according to claim 45, wherein translation and scale change are determined by computation of a number of simultaneous equations each of which relate the translation and scale change to spatial and temporal gradients at a point in the reference image and which are solved to yield a least-squares solution for the motion parameters.
49. A method according to claim 45, wherein the stored reference image is replaced when the accumulated translation and scale change exceeds a given threshold.
50. A method according to claim 38, wherein translation and scale change are determined from a sub-set of the pixels in the camera image.
51. A method according to claim 50, wherein the sub-set of pixels lie in a regular array in the camera image.
52. A method according to claim 50, wherein the sub-set of pixels are chosen to lie at points of high spatial gradient.
53. A method according to claim 38, wherein a stored reference image is compared to the camera image after transforming the reference image based on a prediction of translation and scale change and a measure of translation and scale change derived from the comparison.
54. A method according to claim 53, wherein the reference image comprises an earlier camera image.
55. A method according to claim 54, wherein the reference image is replaced when the overlap between the reference image and the current camera image falls below a given proportion.
56. A method according to claim 38, wherein data obtained from a sensor on the camera is used in derivation of said measures of translation and scale change.
57. Apparatus for deriving at least one parameter representing the movement of a camera from a camera image, said camera viewing at least a portion of a background having a pattern comprising a plurality of areas of a first tone and a plurality of areas of a second tone arranged so that transitions between areas occur along at least two axes, whereby the transitions provide reference points from which measures of both translation and scale change may be determined, the apparatus comprising: means for receiving an output signal from said camera, said signal defining a camera image; means for detecting said first and second tones in the camera image by chroma-keying; and means for deriving from results of said detecting means a measure of translation and scale change.
58. Apparatus according to claim 57 further comprising said background.
59. Apparatus according to claim 57 further comprising said camera.
60. Apparatus according to claim 57 further comprising said background and said camera.
61. Apparatus according to claim 57, wherein the first and second tones differ in hue.
62. Apparatus according to claim 57, wherein the first and second tones differ in brightness.
63. Apparatus according to claim 57, wherein the first and second tones are of substantially equal brightness.
64. Apparatus according to claim 57, wherein the pattern comprises a chequered pattern.
65. Apparatus according to claim 57, wherein said detecting means is arranged to remove foreground objects from the image prior to determination of translation and scale change.
66. Apparatus according to claim 57 comprising means for spatially filtering the camera image prior to storage and comparison.
67. Apparatus according to claim 57, further comprising an image store that stores a reference image for use by said deriving means.
68. Apparatus according to claim 57, wherein the deriving means is configured to determine translation and scale change by transforming the reference image and comparing with the camera image.
69. Apparatus according to claim 68, wherein said transforming is based on a prediction of the translation and scale change.
70. Apparatus according to claim 67, wherein translation and scale change are determined by computation of a number of simultaneous equations each of which relate the translation and scale change to spatial and temporal gradients at a point in the reference image and which are solved to yield a least-squares solution for the motion parameters.
71. Apparatus according to claim 67 wherein the stored reference image is replaced when the accumulated translation and scale change exceeds a given threshold.
72. Apparatus according to claim 57, wherein said deriving means is configured to determine translation and scale change from a sub-set of the pixels in the camera image.
73. Apparatus according to claim 72, wherein the sub-set of pixels lie in a regular array in the camera image.
74. Apparatus according to claim 72 wherein the sub-set of pixels are chosen to lie at points of high spatial gradient.
75. Apparatus according to claim 57, further comprising a sensor on the camera that provides data for use in derivation of said measures of translation and scale change.
76. Apparatus according to claim 57, wherein said deriving means comprises digital signal processing apparatus.
77. Apparatus for deriving at least one parameter representing the movement of a camera from a camera image, said camera viewing at least a portion of a background having a pattern comprising a plurality of areas of a first tone and a plurality of areas of a second tone arranged so that transitions between areas occur along at least two axes, whereby the transitions provide reference points from which measures of both translation and scale change may be determined, the apparatus comprising: an input receiving an output signal from said camera, said signal defining a camera image; and a signal processing system that 1) detects said first and second tones in the camera image by chroma-keying; and 2) derives from the results of said detection a measure of translation and scale change.
78. Apparatus according to claim 77 further comprising said background.
79. Apparatus according to claim 77 further comprising said camera.
80. Apparatus according to claim 77 further comprising said background and said camera.
81. Apparatus according to claim 77, wherein the first and second tones differ in hue.
82. Apparatus according to claim 77, wherein the first and second tones differ in brightness.
83. Apparatus according to claim 77, wherein the first and second tones are of substantially equal brightness.
84. Apparatus according to claim 77, wherein the pattern comprises a chequered pattern.
85. Apparatus according to claim 77, wherein said signal processing system removes foreground objects from the image prior to determination of translation and scale change.
86. Apparatus according to claim 77, wherein said signal processing system determines translation and scale change by transforming the reference image and comparing with the camera image.
87. Apparatus according to claim 77, further comprising an image store that stores a reference image for use by said signal processing system.
88. Apparatus according to claim 87 comprising a filter that spatially filters the camera image prior to storage and comparison.
89. Apparatus according to claim 86, wherein said transforming is based on a prediction of the translation and scale change.
90. Apparatus according to claim 87, wherein translation and scale change are determined by computation of a number of simultaneous equations each of which relate the translation and scale change to spatial and temporal gradients at a point in the reference image and which are solved to yield a least-squares solution for the motion parameters.
91. Apparatus according to claim 87, arranged to replace the stored reference image when the accumulated translation and scale change exceeds a given threshold.
92. Apparatus according to claim 77, wherein said signal processing system determines translation and scale change from a sub-set of the pixels in the camera image.
93. Apparatus according to claim 92, wherein the sub-set of pixels lie in a regular array in the camera image.
94. Apparatus according to claim 92 wherein the sub-set of pixels are chosen to lie at points of high spatial gradient.
95. Apparatus according to claim 77, further comprising a sensor on the camera that provides data for use in derivation of said measures of translation and scale change.
96. Apparatus according to claim 77, wherein said deriving means comprises digital signal processing apparatus.
97. The apparatus of claim 35 wherein said areas of the background are quadrilaterals.
98. The method of claim 38 wherein said areas of the first tone and areas of the second tone are quadrilaterals.
99. The apparatus of claim 57 wherein said areas of the first tone and areas of the second tone are quadrilaterals.
100. The apparatus of claim 77 wherein said areas of the first tone and areas of the second tone are quadrilaterals.